RESIDUAL EMPIRICAL PROCESSES FOR LONG AND SHORT MEMORY TIME SERIES
This paper studies the residual empirical process of long- and short-memory time series regression models and establishes its uniform expansion under a general framework. The results are applied to the stochastic regression models and unstable autoregressive models. For the long-memory noise, it is shown that the limit distribution of the Kolmogorov-Smirnov test statistic studied in Ho and Hsing [Ann. Statist. 24 (1996) 992-1024] does not hold when the stochastic regression model includes an unknown intercept or when the characteristic polynomial of the unstable autoregressive model has a unit root. To this end, two new statistics are proposed to test for the distribution of the long-memory noises of stochastic regression models and unstable autoregressive models.

1. Introduction. Let the time series $\{y_t\}$ be generated by the model

(1.1) $y_t = \beta' X_t + \varepsilon_t$ and $\varepsilon_t = \sum_{i=0}^{\infty} a_i e_{t-i}$,

where the $X_t$'s are a sequence of $p$-dimensional time series which are measurable with respect to $\mathcal{F}_{t-1} = \sigma\{e_{t-1}, e_{t-2}, \ldots\}$ or independent of $\{e_t\}$. The coefficients $a_i$ satisfy $\sum_{i=0}^{\infty} a_i^2 < \infty$; $a_0 = 1$ and $a_k = k^{H-3/2} L_0(k)$ for some slowly varying function $L_0$ [see Feller (1971)] with $H < 1$; and $\{e_t\}$ is a sequence of i.i.d. mean zero random variables with $\sigma_e^2 = E e_t^2 < \infty$. The process $\{\varepsilon_t\}$ exhibits a long-memory (short-memory) phenomenon when $H \in (1/2, 1)$ [$H < 1/2$], which has been studied extensively in the literature; see, for example, Robinson (1995a, 1995b) and the references therein. When model (1.1) is used to construct forecasting intervals or value-at-risk (VaR), knowledge of the distribution function $F(x)$ of $\varepsilon_t$ is of crucial importance. This motivates the study of tests for $F(x)$ and of the related empirical processes of $\{\varepsilon_t\}$.
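To make the noise in model (1.1) concrete, the following is a minimal simulation sketch. It assumes, purely for illustration, that $L_0 \equiv 1$, that the innovations $e_t$ are standard normal, and that the infinite moving average is truncated after `trunc` lags; none of these choices comes from the paper, and the function names are hypothetical.

```python
import random


def simulate_linear_noise(n, H, trunc=500, seed=0):
    """Return n values of eps_t = sum_{i=0}^{trunc} a_i * e_{t-i}.

    Illustrative assumptions (not taken from the paper): L_0(k) = 1,
    standard normal innovations e_t, and the infinite moving average
    truncated after `trunc` lags.
    """
    rng = random.Random(seed)
    # a_0 = 1 and a_k = k^(H - 3/2) for k >= 1
    a = [1.0] + [k ** (H - 1.5) for k in range(1, trunc + 1)]
    # innovations e_{1 - trunc}, ..., e_n
    e = [rng.gauss(0.0, 1.0) for _ in range(n + trunc)]
    eps = []
    for t in range(trunc, n + trunc):
        eps.append(sum(a[i] * e[t - i] for i in range(trunc + 1)))
    return eps


def sample_autocorrelation(x, lag):
    """Mean-corrected sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    ck = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n
    return ck / c0


if __name__ == "__main__":
    # Compare the decay of sample autocorrelations for a long-memory
    # (H = 0.9) and a short-memory (H = 0.3) choice of H.
    for H in (0.9, 0.3):
        eps = simulate_linear_noise(300, H)
        print(H, [round(sample_autocorrelation(eps, k), 3) for k in (1, 5, 20)])
```

With $H \in (1/2, 1)$ the coefficients $a_k$ are square summable but not absolutely summable, so one would expect the printed autocorrelations to decay noticeably more slowly than for $H < 1/2$; the truncation makes this only an approximation of the process in (1.1).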

When $H \in (1/2, 1)$, Ho and Hsing (1996) established a strong expansion for the empirical process of $\{\varepsilon_t\}$ in (1.1). Specifically, let

(1.2) $K_n(x) = \frac{1}{\sigma_n} \sum_{t=1}^{n} [I(\varepsilon_t \le x) - F(x)]$,

where $I(\cdot)$ is the indicator function and $\sigma_n^2 = \mathrm{var}(\sum_{t=1}^{n} \varepsilon_t)$. They proved that

(1.3) $\sup_x \left| K_n(x) + \frac{1}{\sigma_n} F'(x) \sum_{t=1}^{n} \varepsilon_t \right| = o(1)$ a.s.,

(1.4) $\sigma_n^2 \sim \kappa(H) n^{2H} L_0^2(n)$ and $\sigma_n^{-1} \sum_{t=1}^{n} \varepsilon_t \to_{\mathcal{L}} N(0, 1)$;

see also Taqqu (1975) and Hosking (1996). Herein, $\sup_x = \sup_{x \in R}$, $\kappa(H) = \int_0^{\infty} (x + x^2)^{H-3/2}\,dx$, $a_n \sim b_n$ means that $a_n/b_n \to 1$ as $n \to \infty$ and $\to_{\mathcal{L}}$ denotes convergence in distribution as $n \to \infty$. By (1.3),

(1.5) $\left[\sup_x F'(x)\right]^{-1} \sup_x |K_n(x)| \to_{\mathcal{L}} |N(0, 1)|$,

if $\sup_x |F'(x)| < \infty$. This is the Kolmogorov-Smirnov test statistic of Ho and Hsing (1996) for testing the distribution $F(x)$. In contrast with the standard weak convergence of the empirical process in the short-memory case, the result (1.5) is somewhat striking, as $\sup_x |K_n(x)|$ does not converge to the maximum of a Brownian bridge as in the traditional case. Weak convergence of $\{K_n(x)\}$ was established in Dehling and Taqqu (1989) when $\{\varepsilon_t\}$ is a long-range dependent Gaussian process. Koul and Surgailis (1997) obtained some related results when $H \in (1/2, 1)$. Wu (2003) showed that (1.3) holds in probability under a weaker condition and a general setup and characterized the limit behavior of $K_n(x)$ when $H < 1/2$; see also Ho and Hsing (1997).
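The quantities in (1.2) and (1.5) can be sketched in code as follows. This is an illustration only: it assumes the noise values, the hypothesised distribution function `F`, the supremum of its density and the normalisation $\sigma_n$ are all supplied by the caller, and it approximates the supremum over $x$ by a maximum over a finite grid. The function names are hypothetical.

```python
import bisect


def empirical_process(eps, F, sigma_n, grid):
    """K_n(x) = sigma_n^{-1} * sum_t [I(eps_t <= x) - F(x)] for each x in grid."""
    n = len(eps)
    ordered = sorted(eps)
    values = []
    for x in grid:
        count = bisect.bisect_right(ordered, x)  # number of eps_t <= x
        values.append((count - n * F(x)) / sigma_n)
    return values


def ks_statistic(eps, F, sup_density, sigma_n, grid):
    """Grid approximation of [sup_x F'(x)]^{-1} * sup_x |K_n(x)| as in (1.5)."""
    k = empirical_process(eps, F, sigma_n, grid)
    return max(abs(v) for v in k) / sup_density


if __name__ == "__main__":
    # Toy input: two observations, a constant "F" and sigma_n = 1, chosen
    # only so that the arithmetic can be followed by hand.
    print(empirical_process([0.0, 1.0], lambda x: 0.5, 1.0, [0.5, 1.0]))
```

In applications $\sigma_n$ is not known and $\{\varepsilon_t\}$ is not observed, which is exactly why the paper turns to the residual-based version of this process.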

Note that since $\{\varepsilon_t\}$ is unobservable in model (1.1), the Kolmogorov-Smirnov test has to be evaluated based on the residual process of $\{\varepsilon_t\}$. In this situation, a key issue of interest is to determine the validity of (1.5) for the Kolmogorov-Smirnov statistic when $\{\varepsilon_t\}$ is replaced by its corresponding residual process. Furthermore, when (1.5) becomes invalid, how can one test for the distribution of $\{\varepsilon_t\}$? These two issues have been studied extensively when $\{\varepsilon_t\}$ is i.i.d.; see Bai (1994, 1996, 2003), Ling (1998), Lee and Wei (1999), Koul (2002), Lee and Taniguchi (2005) and Koul and Ling (2006) for further discussions. But for model (1.1) and for the Kolmogorov-Smirnov statistic studied in Ho and Hsing (1996), these two important issues remain unresolved. When $\beta' X_t$ is a constant and $\varepsilon_t$ is an ARFIMA($p, d, q$) model, the distribution of $\{\varepsilon_t\}$ can be determined by $\{e_t\}$ once the parameters of the ARFIMA model are estimated. In this case, it would be sufficient to test for the distribution of $\{e_t\}$, for which standard procedures for residuals from a model with i.i.d. noises, such as those given in Bai (1994) and Lee and Wei (1999), can be adopted. To study the general residual process of $\{\varepsilon_t\}$, however, substantially different arguments need to be employed which rely heavily on the results of Ho and Hsing (1996, 1997) and Wu (2003).

This paper first establishes a uniform expansion of the residual empirical process of $\{\varepsilon_t\}$ under a general framework. The result is used to study the stochastic regression model of Robinson and Hidalgo (1997) and the unstable AR model of Chan and Terrin (1995), Truong-Van and Larramendy (1996) and Wu (2006). It is shown that the test statistic (1.5) of Ho and Hsing (1996) is no longer valid when the stochastic regression model includes an unknown intercept or when the characteristic polynomial of the unstable AR model has a unit root. Our results encompass not only the long-memory $\{\varepsilon_t\}$, but also the short-memory $\{\varepsilon_t\}$. Furthermore, two new statistics are constructed to test the distribution of the long-memory noises in the stochastic regression model and the unstable AR model.

This paper is organized as follows. A general result is given in Section 2. 
The residual processes of stochastic regression and unstable time series are 
presented in Sections 3 and 4, respectively. 

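2. A general result. Let $\hat\beta_n$ be an estimator of $\beta$ in (1.1). Let $\hat\varepsilon_t = y_t - \hat\beta_n' X_t$ be the residual of model (1.1). Further, define the empirical process based on residuals $\{\hat\varepsilon_t\}$ by

$\hat K_n(x) = \frac{1}{\sigma_n} \sum_{t=1}^{n} [I(\hat\varepsilon_t \le x) - F(x)]$.

For $H \in (1/2, 1)$, $\sigma_n$ is given in (1.4). For $\sum_{j=0}^{\infty} |a_j| < \infty$, which implies $H < 1/2$, Ho and Hsing (1997) show that $\sigma^2 = \lim_{n \to \infty} \sigma_n^2/n$ exists and is finite; see also Wu (2003). Let $G_0$ be the common distribution of $\{e_t\}$. Write $\varepsilon_t = e_t + \tilde\varepsilon_{t-1}$ and let $\Delta_t(x) = G_0'(x - \tilde\varepsilon_{t-1}) - E[G_0'(x - \tilde\varepsilon_{t-1})]$, where $\tilde\varepsilon_{t-1} = \sum_{i=1}^{\infty} a_i e_{t-i}$. Denote $\|M\| = \sqrt{\mathrm{tr}(M'M)}$ for a matrix or vector $M$. We need the following two assumptions.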

Assumption 2.1. (a) $H < 1/2$ and $\sigma > 0$, or $H = 1/2$, $\sigma > 0$ and $\sum_{i=0}^{\infty} |a_i| < \infty$, or $1/2 < H < 1$, and (b) $G_0$ is three times differentiable with bounded, continuous and integrable derivatives such that $\int x^4\,dG_0(x) < \infty$.

Assumption 2.2. Let $\delta_n$ be a $p \times p$ constant matrix depending on $n$ such that the following statements hold:

(a) $\delta_n^{-1}(\hat\beta_n - \beta) = O_p(1)$,

(b) $\sigma_n^{-1} \sum_{t=1}^{n} E\|\delta_n' X_t\| = O(1)$,

(c) $\sigma_n^{-1} \sum_{t=1}^{n} E\|\delta_n' X_t\|^2 = o(1)$,

(d) $\sigma_n^{-1} \sup_x \left\| \sum_{t=1}^{n} \Delta_t(x)\,\delta_n' X_t \right\| = o_p(1)$.

Assumption 2.1(b) can be replaced by a general condition in Wu (2003). $\delta_n$ is the rate of convergence of $\hat\beta_n$. Assumptions 2.2(b) and (c) automatically hold if $\delta_n^{-1} = \sqrt{n} I_p$ and $X_t$ is strictly stationary with $E\|X_t\|^2 < \infty$, where $I_p$ is the $p \times p$ identity matrix. As will be seen in Sections 3 and 4, $\delta_n^{-1}$ may not always be equal to $\sqrt{n} I_p$. Assumptions 2.2(b)-(d) are sufficient for the remainder term in the following expansion to be negligible, although they may not be the weakest ones. We state a general result as follows.

Theorem 2.1. Assume that Assumption 2.1 and Assumption 2.2 hold. Then

$\sup_x |\hat K_n(x) - K_n(x) - R_n F'(x)| = o_p(1)$,

where $R_n = \sigma_n^{-1} (\hat\beta_n - \beta)' \sum_{t=1}^{n} X_t = O_p(1)$.

Remark 2.1. According to this theorem, if $R_n = o_p(1)$, then $\sup_x |\hat K_n(x) - K_n(x)| = o_p(1)$ and, hence, $\sup_x |\hat K_n(x)|$ and $\sup_x |K_n(x)|$ have the same limit distribution. If $R_n \ne o_p(1)$, then the limit distribution of $\sup_x |\hat K_n(x)|$ may be different from that of $\sup_x |K_n(x)|$, as seen in Theorems 3.1 and 4.1. When $H \in (1/2, 1)$, $K_n(x)$ can be replaced by $-F'(x) \sum_{t=1}^{n} \varepsilon_t/\sigma_n$. When $H < 1/2$ with $E X_t = 0$ or when $H \in (1/2, 1)$, $\delta_n^{-1} = \sqrt{n} I_p$ and $\{X_t\}$ is strictly stationary, then $R_n = o_p(1)$.
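The objects in Theorem 2.1 can be written out in code. The sketch below is only an illustration under stated assumptions: the true $\beta$ and the normalisation $\sigma_n$ are treated as known, which is possible in a simulation study but not with real data, and all function names are hypothetical.

```python
import bisect


def residuals(y, X, beta_hat):
    """hat eps_t = y_t - beta_hat' X_t, with X given as a list of p-vectors."""
    return [y_t - sum(b * x for b, x in zip(beta_hat, x_t))
            for y_t, x_t in zip(y, X)]


def residual_empirical_process(res, F, sigma_n, grid):
    """hat K_n(x) = sigma_n^{-1} * sum_t [I(hat eps_t <= x) - F(x)] on a grid."""
    n = len(res)
    ordered = sorted(res)
    return [(bisect.bisect_right(ordered, x) - n * F(x)) / sigma_n for x in grid]


def drift_term(X, beta_hat, beta, sigma_n):
    """R_n = sigma_n^{-1} * (beta_hat - beta)' * sum_t X_t.

    beta is the true parameter, so this is computable only in simulations.
    """
    p = len(beta)
    column_sums = [sum(x_t[j] for x_t in X) for j in range(p)]
    return sum((bh - b) * s
               for bh, b, s in zip(beta_hat, beta, column_sums)) / sigma_n


if __name__ == "__main__":
    y = [1.0, 2.0]
    X = [[1.0], [1.0]]          # intercept-only design, for illustration
    print(residuals(y, X, [0.5]))
    print(drift_term(X, [0.5], [0.0], 2.0))
```

In a Monte Carlo experiment one could compare `residual_empirical_process` with the process built from the true noise and check whether the gap is close to `drift_term` times $F'(x)$, which is what the expansion asserts asymptotically.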

Remark 2.2. We require $a_k$ to have the form $k^{H-3/2} L_0(k)$ because we have to use the tightness condition of empirical processes of $\{\varepsilon_t\}$ of Ho and Hsing (1996) and Wu (2003) for $H \in (1/2, 1)$; and Theorem 3 and Corollary 2 of Wu (2003) for $H < 1/2$. Without this condition, Theorem 2.1 is still valid if $\sum_{i=0}^{\infty} |a_i| < \infty$ as long as the empirical process of $\{\varepsilon_t\}$ is tight on $R$.

Proof of Theorem 2.1. Let $u_n = \delta_n^{-1}(\hat\beta_n - \beta)$. Then $\hat\varepsilon_t = \varepsilon_t - u_n' \delta_n' X_t$ and

$\hat K_n(x) - K_n(x) - \frac{1}{\sigma_n} \sum_{t=1}^{n} F'(x) u_n' \delta_n' X_t = \frac{1}{\sigma_n} \sum_{t=1}^{n} [I(\varepsilon_t \le x + u_n' \delta_n' X_t) - I(\varepsilon_t \le x) - F'(x) u_n' \delta_n' X_t]$.

To study the process $\hat K_n(x)$, consider the process

$\Lambda_n(x, u) = \frac{1}{\sigma_n} \sum_{t=1}^{n} [I(\varepsilon_t \le x + u' \delta_n' X_t) - I(\varepsilon_t \le x) - u' F'(x) \delta_n' X_t]$

for all $u \in R^p$ and $x \in R$. By Assumption 2.2(a), if we can show that

(2.1) $\sup_{u \in [-A, A]^p} \sup_x |\Lambda_n(x, u)| = o_p(1)$ for every $A \in (0, \infty)$,

then Theorem 2.1 is proved. Denote

$Z_n(x, u) = \frac{1}{\sigma_n} \sum_{t=1}^{n} [I(\varepsilon_t \le x + u' \delta_n' X_t) - F(x + u' \delta_n' X_t) - I(\varepsilon_t \le x) + F(x)]$.

By the triangle inequality, $|\Lambda_n(x, u)| \le |Z_n(x, u)| + |H_n(x, u)|$, where

$H_n(x, u) = \frac{1}{\sigma_n} \sum_{t=1}^{n} [F(x + u' \delta_n' X_t) - F(x) - u' \delta_n' X_t F'(x)]$.

Since $\sup_x |G_0''(x)| < \infty$, we have $\sup_x |F''(x)| < \infty$. Using this fact, Assumption 2.2(c) and the Taylor expansion, $\sup_{u \in [-A, A]^p} \sup_x |H_n(x, u)| = o_p(1)$. To prove (2.1), it is sufficient to show that the following equation holds:

(2.2) $\sup_{u \in [-A, A]^p} \sup_x |Z_n(x, u)| = o_p(1)$,

for every $A > 0$. For each $u \in R^p$ and $\lambda \in R$, let

(2.3) $Z_n(x, u, \lambda) = \frac{1}{\sigma_n} \sum_{t=1}^{n} [I(\varepsilon_t \le x + g_t(u, \lambda)) - F(x + g_t(u, \lambda)) - I(\varepsilon_t \le x) + F(x)]$,

where $g_t(u, \lambda) = u' \delta_n' X_t + \lambda \|\delta_n' X_t\|$. For every $\delta > 0$, partition the rectangle $[-A, A]^p$ into $m$ balls $\{C_1, \ldots, C_m\}$ each with radius $\delta$. Take one point in each $C_r$ and denote it by $u_r$. For any $u \in C_r$, we have

(2.4) $|g_t(u, \lambda) - g_t(u_r, \lambda)| \le \|u - u_r\| \|\delta_n' X_t\| \le \delta \|\delta_n' X_t\|$.

Thus, $g_t(u_r, \lambda - \delta) \le g_t(u, \lambda) \le g_t(u_r, \lambda + \delta)$. Note that $Z_n(x, u) = Z_n(x, u, 0)$. By the monotonicity of the indicator function, we obtain that

(2.5) $Z_n(x, u) \le Z_n(x, u_r, \delta) + \frac{1}{\sigma_n} \sum_{t=1}^{n} [F(x + g_t(u_r, \delta)) - F(x + g_t(u, 0))]$

and a reverse inequality holds when $\delta$ is replaced by $-\delta$. Since $\sup_x |G_0'(x)| < \infty$, we have $\sup_x |F'(x)| < \infty$. By the mean value theorem, when $u \in C_r$,

(2.6) $\left| \frac{1}{\sigma_n} \sum_{t=1}^{n} [F(x + g_t(u_r, \pm\delta)) - F(x + g_t(u, 0))] \right| \le \frac{O(1)}{\sigma_n} \sum_{t=1}^{n} |g_t(u_r, \pm\delta) - g_t(u, 0)| \le \frac{O(1)\,\delta}{\sigma_n} \sum_{t=1}^{n} \|\delta_n' X_t\| = O_p(\delta)$,

where the last equality follows from Assumption 2.2(b) and the $O_p(1)$ holds uniformly for all $x \in R$, all $u \in C_r$ and all $r = 1, \ldots, m$.

Given any $\epsilon > 0$ and $\eta > 0$, by (2.6), there exists a $\delta_{1\epsilon} > 0$ such that

$P\left\{ \frac{1}{\sigma_n} \max_r \max_{u \in C_r} \sup_x \left| \sum_{t=1}^{n} [F(x + g_t(u_r, \pm\delta)) - F(x + g_t(u, 0))] \right| \ge \frac{\epsilon}{3} \right\} \le \frac{\eta}{6}$,

when $\delta \le \delta_{1\epsilon}$ and $n \to \infty$. By Lemma A.3, there exists a $\delta_{2\epsilon} > 0$ such that

$P\left\{ \max_r \sup_x |Z_n(x, u_r, \pm\delta)| \ge \frac{\epsilon}{3} \right\} \le P\left\{ \max_r J_{3n}(u_r, \pm\delta) + \delta J_{4n} \ge \frac{\epsilon}{3} \right\} \le \frac{\eta}{3}$,

when $\delta \le \delta_{2\epsilon}$ and $n \to \infty$ because $m$ is an integer depending on $\delta$ but not depending on $n$. By the preceding two inequalities, when $\delta \le \min\{\delta_{1\epsilon}, \delta_{2\epsilon}\}$,

$P\left\{ \sup_{u \in [-A, A]^p} \sup_x |Z_n(x, u)| \ge \epsilon \right\}$

$\le P\left\{ \max_r \sup_x |Z_n(x, u_r, \delta)| \ge \frac{\epsilon}{3} \right\} + P\left\{ \max_r \sup_x |Z_n(x, u_r, -\delta)| \ge \frac{\epsilon}{3} \right\}$

$\quad + P\left\{ \frac{1}{\sigma_n} \max_r \max_{u \in C_r} \sup_x \left| \sum_{t=1}^{n} [F(x + g_t(u_r, \pm\delta)) - F(x + g_t(u, 0))] \right| \ge \frac{\epsilon}{3} \right\}$

$\le \eta$, when $n \to \infty$, proving (2.2). □

3. Residual empirical process of stochastic regression models. In this section we apply the results in Section 2 to the stochastic regression model of Robinson and Hidalgo (1997):

(3.1) $y_t = \alpha_0 + \alpha' x_t + \varepsilon_t$,

where $\varepsilon_t$ is defined in model (1.1), $x_t$ is a $q$-dimensional vector time series independent of $\{\varepsilon_t\}$, and $\beta = (\alpha_0, \alpha')'$ is a $p = q + 1$ dimensional unknown parameter vector. The least squares estimator (LSE) or generalized LSE of $\alpha$ is not asymptotically normal when both $x_t$ and $\varepsilon_t$ exhibit long-range dependence; see Robinson (1994). Robinson and Hidalgo (1997) proposed a class of weighted LSE which is $\sqrt{n}$-consistent and asymptotically normal. Let $f(\lambda)$ be the spectral density of $\varepsilon_t$ and $\phi(\lambda)$ be a real-valued, even and integrable periodic function with period $2\pi$ such that $\psi(\lambda) = \phi^2(\lambda) f(\lambda)$ is continuous. Denote $\phi_j = (2\pi)^{-2} \int_{-\pi}^{\pi} \phi(\lambda) \cos j\lambda\,d\lambda$. Robinson-Hidalgo's weighted LSE of $\alpha$ is defined as

$\hat\alpha_n = \left[ \sum_{t=1}^{n} \sum_{s=1}^{n} (x_t - \bar x)(x_s - \bar x)' \phi_{t-s} \right]^{-1} \sum_{t=1}^{n} \sum_{s=1}^{n} (x_t - \bar x)(y_s - \bar y) \phi_{t-s}$,

where $\bar x = \sum_{t=1}^{n} x_t/n$ and $\bar y = \sum_{t=1}^{n} y_t/n$. Let $\gamma_j = E(\varepsilon_t \varepsilon_{t+j})$ and $\kappa_{abcd}(s, u, v, w)$ be the fourth cumulant of $x_{as}$, $x_{bu}$, $x_{cv}$ and $x_{dw}$, where $x_{as}$ is the $a$th element of $x_s$. Recall the assumptions of Robinson and Hidalgo (1997) as follows.

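Assumption 3.1. (a) $\sum_{j=0}^{\infty} |\phi_j| < \infty$ and $\left( \sum_{j=0}^{n} |\gamma_j| + n \bar\gamma_n \right) \left[ \left( \sum_{j=0}^{n} \bar\phi_j^{1/2} \right)^2 + n \Phi_n \right] = O(n)$ as $n \to \infty$, where $\bar\gamma_a = \max_{j \ge a} |\gamma_j|$, $\bar\phi_a = \max_{j \ge a} |\phi_j|$ and $\Phi_a = \sum_{|j| > a} |\phi_j|$.

(b) $\{x_t\}$ is fourth-order stationary, $\Gamma_u = E[(x_1 - E x_1)(x_{1+|u|} - E x_1)'] \to 0$ and $\max_{|v|, |w| < \infty} |\kappa_{abcd}(0, u, v, w)| \to 0$ as $|u| \to \infty$, $1 \le a, b, c, d \le q$.

(c) $\Sigma_\psi$ is finite and $\Sigma_\phi$ and $\Sigma_\psi$ are nonsingular, where $\Sigma_\chi = \int_{-\pi}^{\pi} \chi(\lambda)\,dH(\lambda)/(2\pi)$ and $H(\lambda)$ is the Hermitian matrix such that $\Gamma_j = \int_{-\pi}^{\pi} e^{ij\lambda}\,dH(\lambda)$.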

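Discussions on this assumption, the choice of $\phi$ and its computational procedures can be found in Robinson and Hidalgo (1997). Under Assumption 3.1, Robinson and Hidalgo (1997) showed that

(3.2) $\sqrt{n}(\hat\alpha_n - \alpha) \to_{\mathcal{L}} N(0, \Sigma_\phi^{-1} \Sigma_\psi \Sigma_\phi^{-1})$.

The intercept term $\alpha_0$ is estimated by

$\hat\alpha_{0n} = \bar y - \hat\alpha_n' \bar x = \alpha_0 + \bar\varepsilon - (\hat\alpha_n - \alpha)' \bar x$,

where $\bar\varepsilon = \sum_{t=1}^{n} \varepsilon_t/n$. When $H \in (1/2, 1)$ or $H < 1/2$ with $E x_t = 0$, we see that $n \sigma_n^{-1} (\hat\alpha_n - \alpha)' \bar x = o_p(1)$ and hence, in these cases, we have

(3.3) $n \sigma_n^{-1} (\hat\alpha_{0n} - \alpha_0) \to_{\mathcal{L}} N(0, 1)$.

The results of Robinson and Hidalgo (1997) hold not only for long-memory $\{\varepsilon_t\}$ but also for short-memory $\{\varepsilon_t\}$. The following result covers the residual empirical process for both long- and short-memory cases.

Theorem 3.1. If Assumptions 2.1 and 3.1 hold, then the results of Theorem 2.1 hold with $\hat\beta_n = (\hat\alpha_{0n}, \hat\alpha_n')'$, $\delta_n = \mathrm{diag}(\sigma_n n^{-1}, n^{-1/2} I_q)$ and $X_t = (1, x_t')'$.

Proof. It is readily seen that Assumptions 2.2(a)-(c) hold. Note that

$\frac{1}{\sigma_n} \sup_x \left\| \sum_{t=1}^{n} \Delta_t(x)\,\delta_n' X_t \right\| \le \sup_x \left| \frac{1}{n} \sum_{t=1}^{n} \Delta_t(x) \right| + \frac{1}{\sqrt{n}\,\sigma_n} \sup_x \left\| \sum_{t=1}^{n} \Delta_t(x) x_t \right\|$.

To check Assumption 2.2(d), we only need to show that

(3.4) $\frac{1}{\sqrt{n}\,\sigma_n} \sup_x \left\| \sum_{t=1}^{n} \Delta_t(x) x_t \right\| = o_p(1)$.

Similarly, it can be proved that $\sup_x |\sum_{t=1}^{n} \Delta_t(x)|/n = o_p(1)$. Since $\sup_x |G_0''(x)| < \infty$ implies $\lim_{|x| \to \infty} G_0'(x) = 0$ [see Lee and Wei (1999)], we see that $E \sup_{|x| > M} \{G_0'(x - \tilde\varepsilon_{t-1}) \|x_t\|\} \to 0$ as $M \to \infty$. Since $\sqrt{n}/\sigma_n = O(1)$, for any given $\epsilon, \eta > 0$, there exists a constant $M > 0$ such that

(3.5) $P\left( \sup_{|x| > M} \left\| \frac{1}{\sqrt{n}\,\sigma_n} \sum_{t=1}^{n} \Delta_t(x) x_t \right\| > \eta \right) \le \frac{O(1)\sqrt{n}}{\sigma_n \eta} E \sup_{|x| > M} \{G_0'(x - \tilde\varepsilon_{t-1}) \|x_t\|\} < \epsilon$,

uniformly in $n$. Partition $[-M, M]$ into $m = [4M\delta^{-1}]$ subintervals such that $-M = c_0 < c_1 < \cdots < c_m = M$ with $c_{r+1} - c_r < \delta$ for any given constant $\delta > 0$. Let $U_{nr} = (\sqrt{n}\,\sigma_n)^{-1} \sum_{t=1}^{n} \Delta_t(c_r) x_t$. When $H \in (1/2, 1)$, $\|U_{nr}\| \le 2 n^{-1/2-H} \sum_{t=1}^{n} \|x_t\| = o_p(1)$. When $H < 1/2$, since $\Delta_t(c_r)$ and $x_t$ are independent for each $c_r$, we can show that $U_{nr} = o_p(1)$. Thus, we have

(3.6) $\sup_{|x| \le M} \left\| \frac{1}{\sqrt{n}\,\sigma_n} \sum_{t=1}^{n} \Delta_t(x) x_t \right\| \le \max_r \sup_{x \in [c_r, c_{r+1}]} \left\| \frac{1}{\sqrt{n}\,\sigma_n} \sum_{t=1}^{n} [\Delta_t(x) - \Delta_t(c_r)] x_t \right\| + \max_r \|U_{nr}\|$

$\le 2\delta \sup_x |G_0''(x)|\, O_p(1) + o_p(1) = O_p(\delta) + o_p(1)$.

Using (3.5)-(3.6), (3.4) is established. □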



We see that $R_n = O_p(1)$ and $K_n(x) = O_p(1)$. When $E x_t = 0$, we have $R_n = n \sigma_n^{-1} (\hat\alpha_{0n} - \alpha_0) \ne o_p(1)$ by virtue of (3.3). In this case, the estimated mean affects the limit distribution of $\hat K_n(x)$ by Theorem 3.1. By (1.3) and (3.3), we have the following result.

Corollary 3.1. If Assumptions 2.1 and 3.1 hold and $H \in (1/2, 1)$, then

$\left[ \sup_x F'(x) \right]^{-1} \sup_x |\hat K_n(x)| \to_{\mathcal{L}} |N(0, 4)|$.

Remark 3.1. This corollary gives a statistic for testing the distribution of the long-memory noises in model (3.1) when $\alpha_0$ is unknown. The asymptotic variance of this test statistic is four times bigger than that in (1.5), which reflects the effects of the slower convergence rate of the estimated parameter $\hat\alpha_{0n}$. When $\alpha_0$ is known, however, the test statistic (1.5) is still valid. As pointed out by the reviewer, when $F = F(x, \theta)$ involves an unknown parameter $\theta$, one should consider $\hat K_n$ with $F(x)$ being replaced by $F(x, \hat\theta_n)$. Under such circumstances, the limit distribution of the statistic is usually different from that of Corollary 3.1. This is reminiscent of the classical Kolmogorov-Smirnov problem when the underlying parameters are estimated; see Durbin (1976). When $H < 1/2$, it can be shown that the limit distribution of the statistic exists by means of the result of Wu (2003). The closed form of such a limit distribution is rather complicated and does not possess a simple expression, however, and is not presented here.

4. Residual empirical process of unstable AR(p) models. This section considers the unstable AR($p$) model with starting value $\{y_0, y_{-1}, \ldots, y_{-p+1}\}$ independent of $\{e_s : s < 0\}$ such that

(4.1) $y_t = \beta' X_t + \varepsilon_t$,

where $X_t = (y_{t-1}, \ldots, y_{t-p})'$, $\beta = (\phi_1, \ldots, \phi_p)'$, and the characteristic polynomial $\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p$ has the decomposition

(4.2) $\phi(z) = (1 - z)^a (1 + z)^b \prod_{k=1}^{l} [(1 - z e^{i\theta_k})(1 - z e^{-i\theta_k})]^{d_k}$,

$a$, $b$, $l$, $d_k$, $k = 1, \ldots, l$, are nonnegative integers, $p = a + b + 2(d_1 + \cdots + d_l)$, and $\{\varepsilon_t\}$ is defined in model (1.1). Here, $a$ denotes the multiplicity of the root $z = 1$ for $\phi(z) = 0$. The same interpretations are given to $b$ and $l$. We estimate $\beta$ by the LSE:

$\hat\beta_n = \left( \sum_{t=1}^{n} X_t X_t' \right)^{-1} \sum_{t=1}^{n} X_t y_t$.

For the special case with $\phi(z) = 1 - z$, Wu (2006) obtained the limiting distribution of $\hat\beta_n$ under Assumption 2.1(a); see also Sowell (1990) and Wang, Lin and Gulati (2003). For the general case, the limit distribution of $\hat\beta_n$ was obtained by Chan and Terrin (1995) and Truong-Van and Larramendy (1996) under the following Assumption 4.1(a) and (b), respectively. It can be seen that Assumption 2.1(a) is much weaker than Assumption 4.1.

Assumption 4.1. (a) $L_0(j) \sim c$, $c$ is a constant, $H \in (1/2, 1)$ and $e_t \sim N(0, \sigma_e^2)$, or (b) $\sum_{j=1}^{\infty} j|a_j| < \infty$ and $\sigma > 0$.

Let $\delta_n = G' J_n^{-1}$, where $G$ is the constant matrix given in Chan and Wei (1988) and $J_n = \mathrm{diag}(N_1, N_2, \ldots, N_{l+2})$ with $N_1 = \mathrm{diag}(n, n^2, \ldots, n^a)$, $N_2 = \mathrm{diag}(n, n^2, \ldots, n^b)$ and $N_{k+2} = \mathrm{diag}(n I_2, \ldots, n^{d_k} I_2)$, $k = 1, \ldots, l$. Define $\xi_H(\tau) = [f_0(\tau), \ldots, f_{a-1}(\tau)]'$, $f_0(\tau) = B_H(\tau)$ and $f_j(\tau) = \int_0^{\tau} f_{j-1}(s)\,ds$, $j = 1, \ldots, a$, where $B_H(\tau)$ is a fractional Brownian motion with covariances

$E[B_H(\tau) B_H(s)] = \frac{1}{2} (s^{2H} + \tau^{2H} - |s - \tau|^{2H})$ for $0 \le s, \tau \le 1$.

We now state the results for model (4.1).

Theorem 4.1. For model (4.1), if Assumption 2.1 holds with $\phi(z) = 1 - z$, or if Assumption 4.1(a) holds, or if Assumptions 2.1(b) and 4.1(b) hold, then the result of Theorem 2.1 holds with $R_n = o_p(1)$ for $a = 0$ and

… if $H < 1/2$,

for $a \ge 1$, where $\Gamma = (\gamma, 0, \ldots, 0)'_{a \times 1}$, $\gamma = \frac{1}{2}(1 - E\varepsilon_t^2/\sigma^2)$, $\zeta_H = \int_0^1 \xi_H(\tau)\,dB_H(\tau)$, $\Omega_H = (\sigma_{ij})_{a \times a}$ and $\sigma_{ij} = \int_0^1 f_i(\tau) f_j(\tau)\,d\tau$.

Let $D[0, 1]$ be the Skorokhod space and $D^p = D \times D \times \cdots \times D$ denote the $p$-fold Cartesian product space of $D = D[0, 1]$. To prove Theorem 4.1, we need the following lemma. Using the results in Chan and Wei (1988), Truong-Van and Larramendy (1996) and Wu (2006), its proof is similar to that of Lemma 2.1 in Ling (1998) and the details are omitted.

Lemma 4.1. Let $\xi = \xi_H$ if $H \in (1/2, 1)$ and $\xi = \xi_{1/2}$ if $H < 1/2$. If the assumptions of Theorem 4.1 hold, then:

(a) … in $D^p$, if $a \ge 1$,

(b) $\frac{1}{\sigma_n} \sum_{t=1}^{[n\tau]} \delta_n' X_t = o_p(1)$ uniformly for all $\tau \in [0, 1]$ if $a = 0$,

(c) $\frac{1}{\sigma_n} \sum_{t=1}^{n} E\|\delta_n' X_t\| = O(1)$,

(d) $\frac{1}{\sigma_n} \sum_{t=1}^{n} E\|\delta_n' X_t\|^2 = o(1)$.

Proof. For simplicity, we only prove Theorem 4.1 for $\phi(z) = 1 - z$, that is, model (4.1) only has one unit root. The general case can similarly be proved by Lemma 4.1. When $\phi(z) = 1 - z$, $\delta_n = n^{-1}$ and $X_t = y_{t-1} = \sum_{i=1}^{t-1} \varepsilon_i$. By Theorem 6.1 of Chan and Terrin (1995) and Theorem 3.1 of Truong-Van and Larramendy (1996) or Theorems 3 and 4 of Wu (2006), Assumption 2.2(a) holds. By Lemma 4.1(c) and (d), we see that Assumptions 2.2(b) and (c) hold.

We now consider Assumption 2.2(d). First, note that $E \sup_{|x| > M} \Delta_t^2(x) \to 0$ as $M \to \infty$ and $\max_{1 \le t \le n} \sigma_n^{-2} E X_t^2 = O(1)$. Thus, for any given $\epsilon > 0$ and $\eta > 0$, there exists a constant $M > 0$ such that

(4.3) $P\left( \sup_{|x| > M} \left| \frac{1}{n\sigma_n} \sum_{t=1}^{n} \Delta_t(x) X_t \right| > \eta \right) \le \frac{1}{\eta n \sigma_n} \sum_{t=1}^{n} \left[ E \sup_{|x| > M} \Delta_t^2(x) \right]^{1/2} \left[ E|X_t|^2 \right]^{1/2} < \epsilon$,

uniformly in $n$. Partition $[-M, M]$ into $m = [4M\delta^{-1}]$ subintervals such that $-M = x_0 < x_1 < \cdots < x_m = M$ with $x_{r+1} - x_r < \delta$ for any given $\delta > 0$. Thus,

(4.4) $\sup_{|x| \le M} \left| \frac{1}{n\sigma_n} \sum_{t=1}^{n} \Delta_t(x) X_t \right| \le \max_r \sup_{x_{r-1} \le x \le x_r} \left| \frac{1}{n\sigma_n} \sum_{t=1}^{n} [\Delta_t(x) - \Delta_t(x_r)] X_t \right| + \max_r \left| \frac{1}{n\sigma_n} \sum_{t=1}^{n} \Delta_t(x_r) X_t \right| \equiv J_{1n} + J_{2n}$, say.



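Since $\sup_x |\Delta_t'(x)| < \infty$, by Lemma 4.1(c) and the Taylor expansion, we have

(4.5) $J_{1n} \le O(\delta)\, \frac{1}{n\sigma_n} \sum_{t=1}^{n} |X_t| = O_p(\delta)$.

For $J_{2n}$, we need the following decomposition:

$\frac{1}{n\sigma_n} \sum_{t=1}^{n} \Delta_t(x) X_t = \frac{1}{n\sigma_n} \sum_{i=1}^{n-1} \left[ \sum_{t=i+1}^{n} \Delta_t(x) \right] \varepsilon_i = \frac{1}{n\sigma_n} \sum_{t=1}^{n} \Delta_t(x) \sum_{i=1}^{n} \varepsilon_i - \frac{1}{n\sigma_n} \sum_{i=1}^{n} \left[ \sum_{t=1}^{i} \Delta_t(x) \right] \varepsilon_i \equiv U_{1n}(x) - U_{2n}(x)$, say.

By the ergodic theorem, $\sum_{t=1}^{n} \Delta_t(x)/n = o_p(1)$ for each $x$. Furthermore, since $\sum_{i=1}^{n} \varepsilon_i/\sigma_n = O_p(1)$, we have $\max_r |U_{1n}(x_r)| = o_p(1)$ for a given $\delta > 0$. We next consider $U_{2n}(x)$. When $H < 1/2$, by Theorem 2 of Wu (2006), we know that $\sum_{t=1}^{[n\tau]} \Delta_t(x)/\sigma_n \to_{\mathcal{L}} \zeta(\tau)$ in $D$ for each $x$ and $\sum_{t=1}^{[n\tau]} \varepsilon_t/\sqrt{n} \to_{\mathcal{L}} \xi(\tau)$ in $D$, where $\zeta(\tau)$ and $\xi(\tau)$ are standard Brownian motions. By Theorem 3.1 of Ling and Li (1998), $U_{2n}(x) = o_p(1)$ for each $x$ and, hence, $\max_r |U_{2n}(x_r)| = o_p(1)$ for any given $\delta > 0$. Thus, Assumption 2.2(d) holds when $H < 1/2$.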
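The summation-by-parts identity behind the split of $J_{2n}$ into $U_{1n}(x) - U_{2n}(x)$ can be checked numerically. The sketch below assumes $y_0 = 0$, so that $X_t = \sum_{i=1}^{t-1} \varepsilon_i$, and omits the common factor $1/(n\sigma_n)$; the function names and the toy inputs are illustrative only.

```python
def weighted_sum(delta, eps):
    """sum_t delta_t * X_t with X_t = eps_1 + ... + eps_{t-1} (X_1 = 0)."""
    total = 0
    partial = 0  # running value of X_t
    for d, e in zip(delta, eps):
        total += d * partial
        partial += e
    return total


def u1_minus_u2(delta, eps):
    """(sum_t delta_t) * (sum_i eps_i) - sum_i (sum_{t<=i} delta_t) * eps_i."""
    u1 = sum(delta) * sum(eps)
    cumulative = 0  # running value of sum_{t<=i} delta_t
    u2 = 0
    for d, e in zip(delta, eps):
        cumulative += d
        u2 += cumulative * e
    return u1 - u2


if __name__ == "__main__":
    delta = [1, 2, 3]
    eps = [4, 5, 6]
    # Both expressions agree: X = (0, 4, 9), so the sum is 0 + 8 + 27.
    print(weighted_sum(delta, eps), u1_minus_u2(delta, eps))
```

The rearrangement is what lets the proof treat the partial sums of $\Delta_t(x)$ and the partial sums of $\varepsilon_i$ separately.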
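When $H \in (1/2, 1)$, we decompose $U_{2n}(x)$ as follows:

(4.6) $U_{2n}(x) = \frac{1}{n\sigma_n} \sum_{i=1}^{n} \left[ \sum_{t=1}^{i} R_t(x) \right] \varepsilon_i + \frac{G_0''(x)}{n\sigma_n} \sum_{i=1}^{n} \left( \sum_{t=1}^{i} \tilde\varepsilon_{t-1} \right) \varepsilon_i \equiv U_{3n}(x) + U_{4n}(x)$,

say, where $R_t(x) = \Delta_t(x) - G_0''(x) \tilde\varepsilon_{t-1}$. For each $x$ and any $\zeta > 0$, by Corollary 1 of Wu (2006) [see also Theorem 3.1 in Ho and Hsing (1997)], we have

(4.7) $E\left[ \sum_{t=1}^{i} R_t(x) \right]^2 = O\left( i^{\max\{1,\, 4(H-1/2)+2\zeta\}} \right)$.

By (4.7), for any $\eta > 0$ and $\delta > 0$, we have

(4.8) $P\left( \max_r |U_{3n}(x_r)| > \eta \right) \le \frac{1}{\eta} \sum_{r=1}^{m} E|U_{3n}(x_r)| \le \frac{1}{\eta n \sigma_n} \sum_{r=1}^{m} \sum_{i=1}^{n} \left\{ E\left[ \sum_{t=1}^{i} R_t(x_r) \right]^2 E\varepsilon_i^2 \right\}^{1/2} = O(n^{-\gamma} L_0^{-1}(n)) \to 0$,

when $n \to \infty$, where $\gamma = \min\{H - 1/2, 1 - H - \zeta\} > 0$. Note that

$U_{4n}(x) = \frac{G_0''(x)}{n\sigma_n} \sum_{i=1}^{n} \left( \sum_{t=1}^{i} \varepsilon_t \right) \varepsilon_i - \frac{G_0''(x)}{n\sigma_n} \sum_{i=1}^{n} \left( \sum_{t=1}^{i} e_t \right) \varepsilon_i$.

By Theorems 3.2 and 3.3 of Chan and Terrin (1995) or Theorem 3 of Wu (2006),

$\frac{1}{\sigma_n^2} \sum_{i=1}^{n} \left( \sum_{t=1}^{i} \varepsilon_t \right) \varepsilon_i \to_{\mathcal{L}} \int_0^1 B_H(s)\,dB_H(s)$.

Thus, the first term in $U_{4n}(x)$ is $o_p(1)$ uniformly in $x \in R$. Note that $\sum_{t=1}^{n} |\varepsilon_t|/n = O_p(1)$ by the ergodic theorem and $\max_{1 \le j \le n} |\sum_{t=1}^{j} e_t|/\sqrt{n} \to_{\mathcal{L}} \max_{0 \le \tau \le 1} |B_{1/2}(\tau)|$. Since $\sqrt{n}/\sigma_n = O(n^{-H+1/2}/L_0(n)) = o(1)$, the second term in $U_{4n}(x)$ is $o_p(1)$ uniformly in $x \in R$. Thus, we have $\sup_x |U_{4n}(x)| = o_p(1)$. Furthermore, by (4.6) and (4.8), $\max_r |U_{2n}(x_r)| = o_p(1)$ for any given $\delta$ when $H \in (1/2, 1)$. Thus, Assumption 2.2(d) holds when $H \in (1/2, 1)$. □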

Remark 4.1. From this theorem, we see that the empirical process of $\{\varepsilon_t\}$ is not affected if $\{\varepsilon_t\}$ is replaced by $\{\hat\varepsilon_t\}$ when $\phi(z)$ does not have a root equaling one. It has a profound effect when $\phi(z)$ has a unit root, however. In particular, using Theorem 3 of Wu (2006), we have the following corollary.

Corollary 4.1. If $\phi(z) = 1 - z$ and Assumption 2.1 holds with $H \in (1/2, 1)$, then it follows that

$\left[ \sup_x F'(x) \right]^{-1} \sup_x |\hat K_n(x)| \to_{\mathcal{L}} \left| B_H(1) + \int_0^1 B_H(\tau)\,dB_H(\tau) \left[ \int_0^1 B_H^2(\tau)\,d\tau \right]^{-1} \int_0^1 B_H(\tau)\,d\tau \right|$.

Remark 4.2. Corollary 4.1 gives the limit distribution of the Kolmogorov-Smirnov statistic. It can be used to test for the distribution of the long-memory noises in model (4.1). For instance, using $\hat\varepsilon_t$ as a proxy for $\varepsilon_t$, $H$ may be estimated by Robinson's (1995a) semiparametric method. Although the asymptotic validity of such a procedure still needs to be examined, for a given $H \in (1/2, 1)$, the percentiles of the limit distribution can be tabulated by means of simulations. Corollary 4.1 thus provides a means to apply the Kolmogorov-Smirnov statistics to model (4.1).
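One possible way to tabulate such percentiles is sketched below. It is a rough illustration rather than the authors' procedure: fractional Brownian motion is simulated on a grid of `m` points by a Cholesky factorisation of its covariance, the time integrals are replaced by Riemann sums, the sign convention of the functional is taken as read from Corollary 4.1, and the stochastic integral is replaced by $B_H^2(1)/2$, which is an assumption appropriate only for $H \in (1/2, 1)$. All names and default values are hypothetical.

```python
import math
import random


def fbm_covariance(s, t, H):
    """Covariance of fractional Brownian motion at times s and t."""
    return 0.5 * (s ** (2 * H) + t ** (2 * H) - abs(s - t) ** (2 * H))


def cholesky(a):
    """Lower-triangular L with L L' = a, for a symmetric positive definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def simulate_limit(H, m=50, reps=200, seed=0):
    """Draws of |B(1) + (int B dB) * (int B dtau) / (int B^2 dtau)|."""
    rng = random.Random(seed)
    grid = [(i + 1) / m for i in range(m)]
    L = cholesky([[fbm_covariance(s, t, H) for t in grid] for s in grid])
    draws = []
    for _ in range(reps):
        z = [rng.gauss(0.0, 1.0) for _ in range(m)]
        path = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(m)]
        b1 = path[-1]
        int_b = sum(path) / m                    # Riemann sum for int B dtau
        int_b2 = sum(v * v for v in path) / m    # Riemann sum for int B^2 dtau
        int_b_db = 0.5 * b1 * b1                 # assumption: valid for H > 1/2
        draws.append(abs(b1 + int_b_db * int_b / int_b2))
    return draws


def percentile(values, q):
    """Simple empirical q-quantile (no interpolation)."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(q * len(ordered)))]


if __name__ == "__main__":
    draws = simulate_limit(0.75)
    print([round(percentile(draws, q), 3) for q in (0.90, 0.95, 0.99)])
```

The grid size and number of replications here are far too small for reliable critical values; they are kept small only so that the pure-Python Cholesky step runs quickly.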

APPENDIX: TECHNICAL LEMMAS 

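Let $x_r = r\epsilon\sigma_n^{-1}$ for any $r \in Z$ and some $\epsilon > 0$ and decompose the real line $R$ as $R = \bigcup_{r \in Z} [x_r, x_{r+1}]$. Let $g_t(u, \lambda)$ be defined in (2.3) and

$a_{nt}(x) = I(\varepsilon_t \le x + g_t(u, \lambda)) - F_{t-1}(x) - I(\varepsilon_t \le x) + G_0(x - \tilde\varepsilon_{t-1})$,

where $F_{t-1}(x) = E[I(e_t \le x - \tilde\varepsilon_{t-1} + g_t(u, \lambda)) \mid \mathcal{F}_{t-1}] = G_0[x - \tilde\varepsilon_{t-1} + g_t(u, \lambda)]$, $u \in [-A, A]^p$ with $A > 0$ and $\lambda \in [-1, 1]$. We have the following lemma.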

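Lemma A.1. Let $Z_{1n}(x, u, \lambda) = \sum_{t=1}^{n} a_{nt}(x)/\sigma_n$. For every $u$ and $\lambda$, if Assumption 2.1 and Assumptions 2.2(b) and (c) hold, then:

(a) $\max_r \max_{x \in [x_r, x_{r+1}]} \frac{1}{\sigma_n} \sum_{t=1}^{n} |F(x_{r+1} + g_t(u, \lambda)) - F(x + g_t(u, \lambda))| = O_p(\epsilon)$,

(b) $\sup_r |Z_{1n}(x_r, u, \lambda)| = o_p(1)$

for any given $\epsilon > 0$.

Proof. By Assumption 2.1(b), $F'(x)$ exists and is bounded; see Ho and Hsing (1996). Since $n/\sigma_n^2 = O(1)$, by the Taylor expansion, part (a) holds.

For part (b), since $\sum_{t=1}^{n} a_{nt}(x)$ is a martingale array with respect to $\mathcal{F}_n = \sigma\{(e_t, X_t), t \le n\}$, by the Rosenthal inequality [see page 23 of Hall and Heyde (1980)],

(A.1) $E\left[ \sum_{t=1}^{n} a_{nt}(x) \right]^4 \le cE\left\{ \sum_{t=1}^{n} E[a_{nt}^2(x) \mid \mathcal{F}_{t-1}] \right\}^2 + c \sum_{t=1}^{n} E[a_{nt}^4(x)] \le cn \sum_{t=1}^{n} E\{E[a_{nt}^2(x) \mid \mathcal{F}_{t-1}]\}^2 + 2c \sum_{t=1}^{n} E[a_{nt}^2(x)]$

for some constant $c$, where we use $a_{nt}^4(x) \le 2a_{nt}^2(x)$. Denote $g_t(u, \lambda)$ by $g_t$ and let $H_t^{\pm}(x) = G_0(x - \tilde\varepsilon_{t-1} \pm |g_t|)$. Since $E[I(e_t \le x - \tilde\varepsilon_{t-1}) \mid \mathcal{F}_{t-1}] = G_0(x - \tilde\varepsilon_{t-1})$ and $G_0(x)$ is nondecreasing, we have

$E[a_{nt}^2(x) \mid \mathcal{F}_{t-1}] \le |F_{t-1}(x) - G_0(x - \tilde\varepsilon_{t-1})| \le H_t^{+}(x) - H_t^{-}(x)$.

Again, since $G_0(x)$ is nondecreasing, for any positive integer $M$, we have

(A.2) $\sum_{r=-M}^{M} E[H_t^{+}(x_r) - H_t^{-}(x_r)]$

$\le \frac{\sigma_n}{\epsilon} \sum_{r=-M}^{M} E\left[ \int_{x_r}^{x_{r+1}} H_t^{+}(x)\,dx - \int_{x_{r-1}}^{x_r} H_t^{-}(x)\,dx \right]$

$\le \frac{\sigma_n}{\epsilon} E\left\{ \int_{x_{-M-1}}^{x_{-M}} H_t^{-}(x)\,dx + \int_{x_M}^{x_{M+1}} H_t^{+}(x)\,dx + \int_{x_{-M}}^{x_M} [H_t^{+}(x) - H_t^{-}(x)]\,dx \right\}$

$\le 2 + \frac{\sigma_n}{\epsilon} E\left\{ \int_{x_{-M}}^{x_M} \int_{-|g_t|}^{|g_t|} G_0'(x - \tilde\varepsilon_{t-1} + y)\,dy\,dx \right\}$

$\le 2 + \frac{\sigma_n}{\epsilon} E\left\{ \int_{-|g_t|}^{|g_t|} \int_{-\infty}^{\infty} G_0'(x - \tilde\varepsilon_{t-1} + y)\,dx\,dy \right\}$

$= 2 + \frac{2\sigma_n}{\epsilon} E|g_t|$.

Similarly, we have

(A.3) $\sum_{r=-M}^{M} E[H_t^{+}(x_r) - H_t^{-}(x_r)]^2 \le c \sum_{r=-M}^{M} E\{|g_t| [H_t^{+}(x_r) - H_t^{-}(x_r)]\} \le 2cE|g_t| + \frac{2c\sigma_n}{\epsilon} E g_t^2$,

where $c = 2 \sup_x G_0'(x)$. Using (A.2)-(A.3) and Assumptions 2.2(b)-(c),

(A.4) $\frac{1}{\sigma_n^2} \sum_r \sum_{t=1}^{n} E[a_{nt}^2(x_r)] \le \frac{1}{\sigma_n^2} \sum_{t=1}^{n} \lim_{M \to \infty} \sum_{r=-M}^{M} E[H_t^{+}(x_r) - H_t^{-}(x_r)] = O(1)$,

(A.5) $\frac{1}{\sigma_n^2} \sum_r \sum_{t=1}^{n} E\{E[a_{nt}^2(x_r) \mid \mathcal{F}_{t-1}]\}^2 \le \frac{2c}{\sigma_n^2} \sum_{t=1}^{n} E|g_t| + \frac{2c}{\epsilon\sigma_n} \sum_{t=1}^{n} E g_t^2 = o(1)$,

as $n/\sigma_n^2 = O(1)$. By the Markov inequality, (A.1), (A.4) and (A.5),

$P\left( \sup_r |Z_{1n}(x_r, u, \lambda)| > \eta \right) \le \sum_r P(|Z_{1n}(x_r, u, \lambda)| > \eta) \le \frac{1}{\eta^4 \sigma_n^4} \sum_r E\left[ \sum_{t=1}^{n} a_{nt}(x_r) \right]^4 \to 0$

as $n \to \infty$, for any given $\epsilon > 0$. Thus, part (b) is proved. □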

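Lemma A.2. Let $Z_{2n}(x, u, \lambda) = \sum_{t=1}^{n} [F_{t-1}(x) - G_0(x - \tilde\varepsilon_{t-1}) - F(x + g_t(u, \lambda)) + F(x)]/\sigma_n$. If Assumptions 2.1 and 2.2(b)-(d) hold, then $Z_{2n}(x, u, \lambda) = \lambda J_{1n}(x) + J_{2n}(x, u, \lambda)$ such that $\sup_x |J_{1n}(x)| = O_p(1)$ and $\sup_x \sup_u \sup_\lambda |J_{2n}(x, u, \lambda)| = o_p(1)$.

Proof. By Assumption 2.1(b) and Lemma 6.2 of Ho and Hsing (1996), $F''(x)$ exists and is bounded. By the Taylor expansion and Assumption 2.2(c),

$Z_{2n}(x, u, \lambda) = \frac{1}{\sigma_n} \sum_{t=1}^{n} \left\{ \Delta_t(x) g_t(u, \lambda) + \frac{1}{2} g_t^2(u, \lambda) [G_0''(\xi^*_{t-1}) - F''(x^*)] \right\}$

$= \frac{1}{\sigma_n} \sum_{t=1}^{n} \Delta_t(x) g_t(u, \lambda) + o_p(1)$

$= \frac{\lambda}{\sigma_n} \sum_{t=1}^{n} \Delta_t(x) \|\delta_n' X_t\| + \frac{1}{\sigma_n} \sum_{t=1}^{n} \Delta_t(x) u' \delta_n' X_t + o_p(1)$

$\equiv \lambda J_{1n}(x) + J_{2n}(x, u, \lambda)$, say,

where we use $F'(x) = EG_0'(x - \tilde\varepsilon_{t-1})$, $\xi^*_{t-1} = x - \tilde\varepsilon_{t-1} + \theta g_t(u, \lambda)$ and $x^* = x + \tilde\theta g_t(u, \lambda)$ with $\theta, \tilde\theta \in (0, 1)$ and $o_p(1)$ holding uniformly in $x, u, \lambda$. Since $\sup_x |\Delta_t(x)| \le 2 \sup_x G_0'(x) < \infty$, by Assumption 2.2(b), $\sup_x |J_{1n}(x)| = O_p(1)$. Since $u \in [-A, A]^p$, by Assumption 2.2(d), $\sup_x \sup_u \sup_\lambda |J_{2n}(x, u, \lambda)| = o_p(1)$. The desired conclusion follows. □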



Lemma A.3. If Assumptions 2.1 and 2.2(b)-(d) hold, then it follows that

$\sup_x |Z_n(x, u, \lambda)| \le J_{3n}(u, \lambda) + |\lambda| J_{4n}$,

where $Z_n(x, u, \lambda)$ is defined in (2.3), $0 \le J_{3n}(u, \lambda) = o_p(1)$ for each $u$ and $\lambda$, and $0 \le J_{4n} = O_p(1)$ is independent of $u$.

Proof. Since $I(\varepsilon_t \le x)$ and $F(x)$ are nondecreasing, for any $x \in [x_r, x_{r+1}]$,

$Z_n(x, u, \lambda) \le Z_n(x_{r+1}, u, \lambda) + \frac{1}{\sigma_n} \sum_{t=1}^{n} [F(x_{r+1} + g_t) - F(x + g_t)] + \frac{1}{\sigma_n} \sum_{t=1}^{n} [I(\varepsilon_t \le x_{r+1}) - F(x_{r+1}) - I(\varepsilon_t \le x) + F(x)]$,

where $g_t$ denotes $g_t(u, \lambda)$ and a reverse inequality holds when $x_{r+1}$ is replaced by $x_r$. Since $|Z_n(x_{r+1}, u, \lambda)| \le |Z_{1n}(x_{r+1}, u, \lambda)| + |Z_{2n}(x_{r+1}, u, \lambda)|$, we have

$\sup_x |Z_n(x, u, \lambda)| \le \max_r |Z_{2n}(x_r, u, \lambda)| + R_n(u, \lambda)$,

where

(A.6) $R_n(u, \lambda) = \max_r |Z_{1n}(x_r, u, \lambda)| + \max_r \max_{x \in [x_r, x_{r+1}]} \frac{1}{\sigma_n} \sum_{t=1}^{n} |F(x_{r+1} + g_t) - F(x + g_t)| + \sup_{|x_1 - x_2| \le \epsilon\sigma_n^{-1}} \frac{1}{\sigma_n} \left| \sum_{t=1}^{n} [I(\varepsilon_t \le x_1) - F(x_1) - I(\varepsilon_t \le x_2) + F(x_2)] \right|$.

For any $\epsilon, \eta > 0$, by Lemma A.1(a), we can take $\epsilon$ small enough such that the second term of (A.6) is less than $\eta$ with probability at least $1 - \epsilon/4$. For this $\epsilon$, the first term of (A.6) is $o_p(1)$ by Lemma A.1(b), and the last term of (A.6) is $o_p(1)$ by the tightness of the empirical process of $\{\varepsilon_t\}$ of Ho and Hsing (1996) and Wu (2003). Thus, $R_n(u, \lambda) = o_p(1)$ for each $u$ and $\lambda$. By virtue of Lemma A.2, the conclusion holds. □